
Multi-agent Reinforcement Learning with Sparse Interactions by Negotiation and Knowledge Transfer


Abstract

Reinforcement learning has significant applications for multi-agent systems, especially in unknown dynamic environments. However, most multi-agent reinforcement learning (MARL) algorithms suffer from problems such as exponential computational complexity in the joint state-action space, which makes it difficult to scale up to realistic multi-agent problems. In this paper, a novel algorithm named negotiation-based MARL with sparse interactions (NegoSI) is presented. In contrast to traditional sparse-interaction-based MARL algorithms, NegoSI adopts the equilibrium concept and makes it possible for agents to select the non-strict Equilibrium Dominating Strategy Profile (non-strict EDSP) or Meta equilibrium for their joint actions. The presented NegoSI algorithm consists of four parts: the equilibrium-based framework for sparse interactions, the negotiation for the equilibrium set, the minimum-variance method for selecting one joint action, and the knowledge transfer of local Q-values. In this integrated algorithm, three techniques, i.e., unshared value functions, equilibrium solutions, and sparse interactions, are adopted to achieve privacy protection, better coordination, and lower computational complexity, respectively. To evaluate the performance of the presented NegoSI algorithm, two groups of experiments are carried out with respect to three criteria: steps of each episode (SEE), rewards of each episode (REE), and average runtime (AR). The first group of experiments is conducted on six grid-world games and shows the fast convergence and high scalability of the presented algorithm. In the second group of experiments, NegoSI is applied to an intelligent warehouse problem, and simulation results demonstrate the effectiveness of the presented NegoSI algorithm compared with other state-of-the-art MARL algorithms.
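The four components above describe, at a high level, agents that keep unshared local Q-values, negotiate an equilibrium set in coordination states, and break ties with a minimum-variance rule. Below is a minimal Python sketch of the last two steps under strong simplifying assumptions: a single coordination state, two agents, and a loose dominance test. `negotiate_equilibrium_set` and `min_variance_selection` are hypothetical illustrations, not the authors' non-strict EDSP/Meta-equilibrium procedure.

```python
import numpy as np

def negotiate_equilibrium_set(q1, q2, tol=0.1):
    """Toy stand-in for the negotiation step: keep joint actions (a1, a2)
    in which each agent's action is near its own best response.
    (Loose illustrative criterion, not the paper's equilibrium test.)"""
    eq_set = []
    for a1 in range(len(q1)):
        for a2 in range(len(q2)):
            if q1[a1] >= q1.max() - tol and q2[a2] >= q2.max() - tol:
                eq_set.append((a1, a2))
    return eq_set

def min_variance_selection(eq_set, q1, q2):
    """Pick the joint action whose per-agent Q-values have the smallest
    variance, mirroring the minimum-variance selection idea."""
    return min(eq_set, key=lambda ja: np.var([q1[ja[0]], q2[ja[1]]]))

# Unshared local Q-values of two agents for one coordination state
# (fabricated numbers for illustration only).
q_agent1 = np.array([1.00, 0.95, 0.20])
q_agent2 = np.array([0.90, 1.00, 0.10])

eq_set = negotiate_equilibrium_set(q_agent1, q_agent2)
print("equilibrium set:", eq_set)  # [(0, 0), (0, 1), (1, 0), (1, 1)]
print("joint action:", min_variance_selection(eq_set, q_agent1, q_agent2))  # (0, 1)
```

One reading of the minimum-variance rule is that, among the negotiated equilibria, it favors a joint action valued similarly by all agents, which helps coordination when value functions are unshared; outside coordination states, each agent would simply act greedily on its own Q-table.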
